A Generalized Suffix Tree and its (Un)expected Asymptotic Behaviors

Author

  • Wojciech Szpankowski
Abstract

Suffix trees find several applications in computer science and telecommunications, most notably in algorithms on strings, data compressions and codes. Despite this, very little is known about their typical behaviors. In a probabilistic framework, we consider a family of suffix trees (further called b-suffix trees) built from the first n suffixes of a random word. In this family a noncompact suffix tree (i.e., such that every edge is labeled by a single symbol) is represented by b = 1, and a compact suffix tree (i.e., without unary nodes) is asymptotically equivalent to b → ∞ as n → ∞. We study several parameters of b-suffix trees, namely: the depth of a given suffix, the depth of insertion, the height and the shortest feasible path. Some new results concerning typical (i.e., almost sure) behaviors of these parameters are established. These findings are used to obtain several insights into certain algorithms on words, molecular biology and universal data compression schemes.
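To make the b = 1 case concrete, the following is a minimal Python sketch, not the paper's construction. It assumes that in a noncompact suffix tree each suffix sits in its own leaf, so the depth of a suffix is one more than its longest common prefix with any other inserted suffix, and the height is the largest such depth. The function name `suffix_depths` and the `$` end marker are illustrative choices, not taken from the abstract.

```python
def suffix_depths(word, n):
    """Depths of the first n suffixes of `word` in a noncompact suffix tree (b = 1).

    Assumption: no inserted suffix is a prefix of another one, which holds
    when `word` ends with a unique end marker such as '$'.
    """
    suffixes = [word[i:] for i in range(n)]
    depths = []
    for i, s in enumerate(suffixes):
        longest = 0  # longest common prefix with any other suffix
        for j, t in enumerate(suffixes):
            if i == j:
                continue
            k = 0
            while k < len(s) and k < len(t) and s[k] == t[k]:
                k += 1
            longest = max(longest, k)
        # One extra symbol is needed to separate s from its closest neighbour.
        depths.append(longest + 1)
    return depths


depths = suffix_depths("banana$", 7)
height = max(depths)  # the height is the depth of the deepest leaf
print(depths, height)
```

This quadratic-time version is only meant to illustrate the definitions of depth and height; it does not build the tree itself.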

Similar resources

Suffix Trees Revisited: (Un)Expected Asymptotic Behaviors

Suffix trees find several applications in computer sciences and telecommunications, most notably in algorithms on strings, data compressions and codes. Despite this, very little is known about their typical behavior. We consider in a probabilistic framework a family of suffix trees further called b-suffix trees built from the first n suffixes of a random word. In this family a noncompact suffix...

Suffix Trees and Simple Sources

Using an intricate method, Jacquet and Szpankowski [2] compared the depth of insertion into suffix-trees and tries in the non-uniform Bernoulli model, as well as the average size of suffix-trees and tries under the same model. They proved that the depth of insertion has asymptotically the same probabilistic behaviour in both cases, and that the average sizes of a trie and a suffix-tree built wi...

Suffix Tree of Alignment: An Efficient Index for Similar Data

We consider an index data structure for similar strings. The generalized suffix tree can be a solution for this. The generalized suffix tree of two strings A and B is a compacted trie representing all suffixes in A and B. It has |A|+ |B| leaves and can be constructed in O(|A|+ |B|) time. However, if the two strings are similar, the generalized suffix tree is not efficient because it does not ex...

Space-efficient K-MER algorithm for generalized suffix tree

Suffix trees have emerged to be very fast for pattern searching yielding O(m) time, where m is the pattern size. Unfortunately their high memory requirements make it impractical to work with huge amounts of data. We present a memory efficient algorithm of a generalized suffix tree which reduces the space size by a factor of 10 when the size of the pattern is known beforehand. Experiments on th...

Compact Suffix Trees Resemble PATRICIA Tries: Limiting Distribution of the Depth

Suffix trees are the most frequently used data structures in algorithms on words. In this paper, we consider the depth of a compact suffix tree, also known as the PAT tree, under some simple probabilistic assumptions. For a biased memoryless source, we prove that the limiting distribution for the depth in a PAT tree is the same as the limiting distribution for the depth in a PATRICIA trie, even...

Journal:
  • SIAM J. Comput.

Volume 22  Issue 

Pages  -

Publication date: 1993